234
Bioinformatics of the Brain
9.4.1.3
Modularity Measures
Modularity parameter provides a measure of goodness of the discovered mod-
ules in a network. Given a graph G = (V, E), percentage of edges in a module
Mi can be stated in Eqn. 9.9, giving the percentage of edges inside a module
eii = |{(uv) : u ∈Vi, v ∈Vi, (u, v) ∈E|/|E|
(9.9)
Let ai be the percentage of edges with at least one end in Mi specified as
in Eqn. 9.10:
ai = |{(uv) : u ∈Vi, v ∈V \ Vi, (u, v) ∈E|/|E|
(9.10)
Modularity of a complex network with k modules can now be defined as
in Eqn. 9.11, providing the quality of all modules discovered in the network.
In other words, we evaluate the ratio of edges totally contained in modules to
the edges that have one end in a module.
Q =
k
i=1
(eii −a2
i )
(9.11)
Based on the modularity parameter, modularity-based clustering algo-
rithm proposed by Newman has the following steps. This algorithm with time
complexity O(m + n)n) is classified as agglomerative hierarchical clustering
algorithm as it iteratively forms larger clusters.
1.
Each node v ∈V is a cluster initially.
2.
Repeat
•Merge two clusters Ci and Cj that will increase modularity M
by the largest amount.
3.
Until merging any two clusters does not improve M.
9.4.1.4
Spectral Clustering
This method of graph clustering uses algebraic properties of the graph that
represents the complex network. The Laplacian matrix of a graph G = (V, E)
is L = D −A where D is the degree matrix with di ∈D is a diagonal
element representing degree of vertex i and A is the adjacency matrix. In
normalized form, L = I −D1/2AD−1/2. The eigenvalues of L are real and
symmetric and the second eigenvalue of L is called the Fiedler value and the
corresponding eigenvector is denoted as Fiedler vector. The spectral bisection
procedure using Fiedler vector is defined as follows. Having constructed the
Fiedler vector, each entry is compared with a threshold value, commonly 0
or the median value of the Fiedler vector, and a vertex that has a smaller or
equal value to the threshold is placed in cluster 1; any vertex with a larger
value is stored in cluster 2. This algorithm may be implemented recursively
to discover k clusters.